feat!: Flatten JudgeResponse and EvalScore into new LDJudgeResult #1284

Merged
jsonbailey merged 8 commits into feat/ai-sdk-next-release from jb/aic-2200/simplify-judge-response
Apr 16, 2026

Conversation

@jsonbailey (Contributor) commented Apr 16, 2026

Summary

  • Replaces EvalScore and JudgeResponse with a flat LDJudgeResult interface (judgeConfigKey?, success, errorMessage?, sampled, score?, reasoning?, metricKey?)
  • Judge.evaluate() and evaluateMessages() now always return LDJudgeResult (never undefined). Sampling skip returns { sampled: false, success: false } instead of undefined.
  • sampled: true means the evaluation was sampled (actually run); sampled: false means it was skipped by sampling.
  • Replaces trackEvalScores and trackJudgeResponse with trackJudgeResult on both LDAIConfigTracker and LDGraphTracker. The new method guards on !result.sampled and !result.success — no event is emitted when either condition is true.
  • ChatResponse.evaluations type simplified from Promise<Array<JudgeResponse | undefined>> to Promise<LDJudgeResult[]>.
  • Public exports updated: EvalScore and JudgeResponse removed, LDJudgeResult added.
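
A minimal sketch of the flattened shape and how a caller might consume it. The field names come from the summary above; the field types, the local `LDJudgeResult` declaration, and the `describeResult` helper are assumptions for illustration, not the SDK's actual definitions.

```typescript
// Assumed shape: field names are taken from the PR summary, types are guesses.
interface LDJudgeResult {
  judgeConfigKey?: string;
  success: boolean;
  errorMessage?: string;
  sampled: boolean;
  score?: number;
  reasoning?: string;
  metricKey?: string;
}

// Hypothetical helper: classify a result without any `undefined` checks on the
// result itself, since evaluate() is described as always returning an object.
function describeResult(result: LDJudgeResult): string {
  if (!result.sampled) {
    return 'skipped by sampling';
  }
  if (!result.success) {
    return `failed: ${result.errorMessage ?? 'unknown error'}`;
  }
  return `${result.metricKey ?? 'unknown metric'} = ${result.score}`;
}

// The three states a caller can expect to see.
const skipped: LDJudgeResult = { sampled: false, success: false };
const failed: LDJudgeResult = {
  sampled: true,
  success: false,
  errorMessage: 'provider timeout',
};
const scored: LDJudgeResult = {
  sampled: true,
  success: true,
  metricKey: 'relevance',
  score: 0.9,
  reasoning: 'On topic.',
};

console.log([skipped, failed, scored].map(describeResult));
```

Because `ChatResponse.evaluations` now resolves to `LDJudgeResult[]`, a consumer could map over the array with a helper like this instead of filtering out `undefined` entries first.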

Mirrors the Python SDK change: launchdarkly/python-server-sdk-ai#132

Test plan

  • All 141 existing unit tests updated and passing
  • New tests for sampled=false (skipped) path in both config tracker and graph tracker
  • New test for success=false guard in both trackers
  • Lint clean

🤖 Generated with Claude Code


Note

Medium Risk
This is a breaking, cross-cutting API refactor that changes return types and tracking method names/semantics; downstream consumers may silently stop tracking judge metrics if they don’t propagate sampled/success or metricKey/score correctly.

Overview
Breaking API change: replaces JudgeResponse/EvalScore (evals map) with a flat LDJudgeResult (score, reasoning, metricKey, sampled, errorMessage, etc.), updating public exports accordingly.

Judge.evaluate()/evaluateMessages() now always return an LDJudgeResult (never undefined), returning explicit unsampled results when sampling skips execution and explicit error results for invalid config/provider failures; _parseEvaluationResponse now returns a single metric’s {score, reasoning} or undefined.
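
The return paths described above can be sketched roughly as follows. This is not the code in `Judge.ts`: the `JudgeSketch` type, the `Provider` callback, the injected `random` function, the error message strings, the order of the checks, and the choice to mark error results as `sampled: true` are all assumptions made for illustration.

```typescript
// Assumed shape: field names are from the PR summary, types are guesses.
interface LDJudgeResult {
  judgeConfigKey?: string;
  success: boolean;
  errorMessage?: string;
  sampled: boolean;
  score?: number;
  reasoning?: string;
  metricKey?: string;
}

// Hypothetical stand-in for the judge's model call: resolves to parsed JSON or throws.
type Provider = (input: string, output: string) => Promise<unknown>;

// Hypothetical, simplified judge configuration.
interface JudgeSketch {
  metricKey?: string;
  samplingRate: number; // 0 = never evaluate, 1 = always evaluate
  provider: Provider;
}

// Returns a single metric's { score, reasoning }, or undefined if the payload is malformed.
function parseEvaluationResponse(
  data: unknown,
): { score: number; reasoning: string } | undefined {
  if (typeof data !== 'object' || data === null) {
    return undefined;
  }
  const { score, reasoning } = data as { score?: unknown; reasoning?: unknown };
  if (typeof score !== 'number' || typeof reasoning !== 'string') {
    return undefined;
  }
  return { score, reasoning };
}

async function evaluate(
  judge: JudgeSketch,
  input: string,
  output: string,
  random: () => number = Math.random, // injectable so the sketch is testable
): Promise<LDJudgeResult> {
  // Sampling skip: an explicit unsampled result instead of undefined.
  if (random() >= judge.samplingRate) {
    return { sampled: false, success: false };
  }
  const { metricKey } = judge;
  // Invalid config: an explicit error result.
  if (!metricKey) {
    return { sampled: true, success: false, errorMessage: 'judge config has no metric key' };
  }
  try {
    const parsed = parseEvaluationResponse(await judge.provider(input, output));
    if (!parsed) {
      return {
        sampled: true,
        success: false,
        metricKey,
        errorMessage: 'could not parse evaluation response',
      };
    }
    return { sampled: true, success: true, metricKey, ...parsed };
  } catch (err) {
    // Provider failure: surfaced on the result rather than thrown to the caller.
    const errorMessage = err instanceof Error ? err.message : String(err);
    return { sampled: true, success: false, metricKey, errorMessage };
  }
}
```

Every branch resolves to an `LDJudgeResult`, so callers can branch on `sampled` and `success` rather than on whether a value was returned at all.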

Tracking APIs are simplified: LDAIConfigTracker and LDGraphTracker replace trackEvalScores/trackJudgeResponse with trackJudgeResult, which only emits an event when sampled and success are true; TrackedChat and the direct-judge example/tests are updated to use the new shape and behavior.
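
The guard described above could look something like the sketch below. `createTracker`, `TrackFn`, and the event payload layout are hypothetical stand-ins, not the `LDAIConfigTracker` or `LDGraphTracker` implementation; the additional check that `metricKey` and `score` are present is also an assumption.

```typescript
// Assumed shape: field names are from the PR summary, types are guesses.
interface LDJudgeResult {
  judgeConfigKey?: string;
  success: boolean;
  errorMessage?: string;
  sampled: boolean;
  score?: number;
  reasoning?: string;
  metricKey?: string;
}

// Hypothetical event sink standing in for whatever the tracker uses to emit events.
type TrackFn = (
  eventKey: string,
  data: Record<string, unknown>,
  metricValue: number,
) => void;

// Hypothetical factory; the real trackers are classes with more responsibilities.
function createTracker(track: TrackFn, trackData: Record<string, unknown>) {
  return {
    trackJudgeResult(result: LDJudgeResult): void {
      // No event when the evaluation was skipped by sampling or failed.
      if (!result.sampled || !result.success) {
        return;
      }
      // Assumption: without a metric key and a score there is nothing to report.
      if (result.metricKey === undefined || result.score === undefined) {
        return;
      }
      track(
        result.metricKey,
        { ...trackData, judgeConfigKey: result.judgeConfigKey },
        result.score,
      );
    },
  };
}

// Usage: only the third call produces an event.
const emitted: Array<{ key: string; value: number }> = [];
const tracker = createTracker(
  (key, _data, value) => {
    emitted.push({ key, value });
  },
  { configKey: 'chat-config' },
);

tracker.trackJudgeResult({ sampled: false, success: false });
tracker.trackJudgeResult({ sampled: true, success: false, errorMessage: 'provider down' });
tracker.trackJudgeResult({ sampled: true, success: true, metricKey: 'relevance', score: 0.7 });

console.log(emitted);
```

Keeping the guard inside the tracker means callers can pass every result through unconditionally, which is one way to avoid the silent loss of metrics flagged in the risk note.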

Reviewed by Cursor Bugbot for commit 97b1ce8. Bugbot is set up for automated code reviews on this repo.

@github-actions (Contributor)

@launchdarkly/js-sdk-common size report
This is the brotli compressed size of the ESM build.
Compressed size: 25623 bytes
Compressed size limit: 29000
Uncompressed size: 125843 bytes

@github-actions (Contributor)

@launchdarkly/browser size report
This is the brotli compressed size of the ESM build.
Compressed size: 179375 bytes
Compressed size limit: 200000
Uncompressed size: 829982 bytes

@github-actions (Contributor)

@launchdarkly/js-client-sdk-common size report
This is the brotli compressed size of the ESM build.
Compressed size: 37169 bytes
Compressed size limit: 38000
Uncompressed size: 204305 bytes

@github-actions (Contributor)

@launchdarkly/js-client-sdk size report
This is the brotli compressed size of the ESM build.
Compressed size: 31655 bytes
Compressed size limit: 34000
Uncompressed size: 112792 bytes

Comment thread packages/sdk/server-ai/src/api/judge/Judge.ts Outdated
Comment thread packages/sdk/server-ai/src/api/judge/Judge.ts Outdated
Comment thread packages/sdk/server-ai/src/api/judge/types.ts Outdated
@jsonbailey jsonbailey marked this pull request as ready for review April 16, 2026 15:10
@jsonbailey jsonbailey requested a review from a team as a code owner April 16, 2026 15:10
jsonbailey and others added 6 commits April 16, 2026 10:15
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…arseEvaluationResponse

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Jason Bailey <accounts@sidewaysgravity.com>
getTrackData() no longer accepts graphKey — it comes from the constructor.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@jsonbailey jsonbailey force-pushed the jb/aic-2200/simplify-judge-response branch from 6e54b30 to 574ffd7 Compare April 16, 2026 15:17
@joker23 (Contributor) left a comment

only a couple of optional nits

Comment thread packages/sdk/server-ai/src/api/chat/TrackedChat.ts Outdated
Comment thread packages/sdk/server-ai/src/api/judge/types.ts Outdated
Comment thread packages/sdk/server-ai/src/api/judge/Judge.ts Outdated
jsonbailey and others added 2 commits April 16, 2026 11:50
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@jsonbailey jsonbailey merged commit dd49a79 into feat/ai-sdk-next-release Apr 16, 2026
44 checks passed
@jsonbailey jsonbailey deleted the jb/aic-2200/simplify-judge-response branch April 16, 2026 17:32
@github-actions github-actions Bot mentioned this pull request Apr 20, 2026
jsonbailey added a commit that referenced this pull request Apr 21, 2026
🤖 I have created a release *beep* *boop*
---


<details><summary>browser: 0.1.16</summary>

## [0.1.16](browser-v0.1.15...browser-v0.1.16) (2026-04-21)


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @launchdarkly/js-client-sdk bumped from 4.6.0 to 4.6.1
</details>

<details><summary>browser-telemetry: 1.0.32</summary>

## [1.0.32](browser-telemetry-v1.0.31...browser-telemetry-v1.0.32) (2026-04-21)


### Bug Fixes

* correct typeof comparisons in browser SDK
([#1301](#1301))
([f4bd636](f4bd636))
* **js-client-sdk:** better `undefined` handling
([#1303](#1303))
([4818678](4818678))


### Dependencies

* The following workspace dependencies were updated
  * devDependencies
    * @launchdarkly/js-client-sdk bumped from 4.6.0 to 4.6.1
</details>

<details><summary>js-client-sdk: 4.6.1</summary>

## [4.6.1](js-client-sdk-v4.6.0...js-client-sdk-v4.6.1) (2026-04-21)


### Bug Fixes

* correct typeof comparisons in browser SDK
([#1301](#1301))
([f4bd636](f4bd636))
* **js-client-sdk:** better `undefined` handling
([#1303](#1303))
([4818678](4818678))
</details>

<details><summary>react-sdk: 0.2.2</summary>

## [0.2.2](react-sdk-v0.2.1...react-sdk-v0.2.2) (2026-04-21)


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @launchdarkly/js-client-sdk bumped from ^4.6.0 to ^4.6.1
</details>

<details><summary>server-sdk-ai: 0.17.0</summary>

## [0.17.0](server-sdk-ai-v0.16.8...server-sdk-ai-v0.17.0) (2026-04-21)


### ⚠ BREAKING CHANGES

* Flatten JudgeResponse and EvalScore into new LDJudgeResult
([#1284](#1284))
* Add per-execution runId, at-most-once tracking, and cross-process
tracker resumption
([#1270](#1270))

### Features

* Add per-execution runId, at-most-once tracking, and cross-process
tracker resumption
([#1270](#1270))
([fc25ab7](fc25ab7))
* Flatten JudgeResponse and EvalScore into new LDJudgeResult
([#1284](#1284))
([aba1221](aba1221))
* Implement agent graph definitions
([#1282](#1282))
([e7d08e5](e7d08e5))
* simplify evaluation schema to flat score/reasoning shape
([#1286](#1286))
([c132e9f](c132e9f))


### Bug Fixes

* Add support for graph metric tracking
([#1269](#1269))
([034a89d](034a89d))
</details>

<details><summary>server-sdk-ai-langchain: 0.5.5</summary>

## [0.5.5](server-sdk-ai-langchain-v0.5.4...server-sdk-ai-langchain-v0.5.5) (2026-04-21)


### Dependencies

* The following workspace dependencies were updated
  * devDependencies
    * @launchdarkly/server-sdk-ai bumped from ^0.16.8 to ^0.17.0
  * peerDependencies
    * @launchdarkly/server-sdk-ai bumped from ^0.15.0 || ^0.16.0 to ^0.17.0
</details>

<details><summary>server-sdk-ai-openai: 0.5.5</summary>

## [0.5.5](server-sdk-ai-openai-v0.5.4...server-sdk-ai-openai-v0.5.5) (2026-04-21)


### Dependencies

* The following workspace dependencies were updated
  * devDependencies
    * @launchdarkly/server-sdk-ai bumped from ^0.16.8 to ^0.17.0
  * peerDependencies
    * @launchdarkly/server-sdk-ai bumped from ^0.15.0 || ^0.16.0 to ^0.17.0
</details>

<details><summary>server-sdk-ai-vercel: 0.5.5</summary>

## [0.5.5](server-sdk-ai-vercel-v0.5.4...server-sdk-ai-vercel-v0.5.5) (2026-04-21)


### Dependencies

* The following workspace dependencies were updated
  * devDependencies
    * @launchdarkly/server-sdk-ai bumped from ^0.16.8 to ^0.17.0
  * peerDependencies
    * @launchdarkly/server-sdk-ai bumped from ^0.15.0 || ^0.16.0 to ^0.17.0
</details>

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Medium Risk**
> Primarily a version/changelog bump, but it publishes
`@launchdarkly/server-sdk-ai` `0.17.0` with documented breaking API
changes that can impact downstream consumers and provider peer
dependency resolution.
> 
> **Overview**
> Bumps release versions across the monorepo via
`.release-please-manifest.json`, updating `@launchdarkly/server-sdk-ai`
to `0.17.0`, `@launchdarkly/js-client-sdk` to `4.6.1`, and related
packages (`@launchdarkly/browser`, `@launchdarkly/react-sdk`,
`@launchdarkly/browser-telemetry`, and AI provider packages)
accordingly.
> 
> Updates package metadata, changelogs, examples, and embedded
SDK/wrapper version strings (e.g., `BrowserInfo` and `LDReactClient`) to
reflect the new releases, including `server-sdk-ai`’s `0.17.0`
breaking-change notes and provider peer dependency bumps to `^0.17.0`.
> 
> <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit
e7f8c09. Bugbot is set up for automated
code reviews on this repo. Configure
[here](https://www.cursor.com/dashboard/bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: jsonbailey <jbailey@launchdarkly.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2 participants